Systematic database creation for expressive singing voice synthesis control
Abstract
In singing voice synthesis, generating the synthesizer controls is key to obtaining expressive performances. In our case, we use a system that selects, transforms and concatenates units of short melodic contours from a recorded database. This paper proposes a systematic procedure for creating such a database. The aim is to cover relevant style-dependent combinations of features such as note duration, pitch interval and note strength. The higher the percentage of covered combinations, the less the units need to be transformed to match a target score. At the same time, the units must be musically meaningful in the target style. To create a style-dependent database, the melodic combinations of features to cover are identified, statistically modeled and grouped by similarity. Then, short melodic exercises of four measures are created using a dynamic programming algorithm. The Viterbi cost functions account for the statistically observed context transitions, harmony, position within the measure and readability. The final systematic score database is formed by the sequence of the resulting melodic exercises.
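The dynamic programming step can be illustrated with a generic minimum-cost Viterbi search. This is only a minimal sketch, not the procedure used in the paper: the function name `viterbi_min_cost`, the two abstract note classes "A" and "B", and all cost values are hypothetical stand-ins for the transition, harmony, measure-position and readability costs named in the abstract.

```python
def viterbi_min_cost(states, n_steps, transition_cost, position_cost):
    """Return (total_cost, path) for the cheapest sequence of n_steps states.

    transition_cost(prev, cur) and position_cost(state, t) are caller-supplied
    (hypothetical) cost functions; lower is better.
    """
    # best[s] = (cost of the cheapest path ending in s, that path)
    best = {s: (position_cost(s, 0), [s]) for s in states}
    for t in range(1, n_steps):
        new_best = {}
        for s in states:
            # Choose the predecessor minimising accumulated + transition cost.
            prev = min(states, key=lambda p: best[p][0] + transition_cost(p, s))
            cost = best[prev][0] + transition_cost(prev, s) + position_cost(s, t)
            new_best[s] = (cost, best[prev][1] + [s])
        best = new_best
    # Cheapest path over all possible final states.
    return min(best.values(), key=lambda cp: cp[0])


# Toy example with made-up costs: changing class is cheap (1), repeating the
# same class is expensive (5), and class "B" is penalised at position 0.
def toy_transition_cost(prev, cur):
    return 5 if prev == cur else 1


def toy_position_cost(state, t):
    return 10 if (state == "B" and t == 0) else 0


demo_cost, demo_path = viterbi_min_cost(
    ["A", "B"], 4, toy_transition_cost, toy_position_cost
)
print(demo_cost, demo_path)
```

In a realistic setting the transition cost might be derived from observed transition statistics (for example a negative log probability), and the per-position cost would combine several weighted terms; those choices are assumptions here and are not specified in the abstract.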
Similar resources
Expressive Control of Singing Voice Synthesis Using Musical Contexts and a Parametric F0 Model
Expressive singing voice synthesis requires appropriate control of both prosodic and timbral aspects. While it is desirable to have intuitive control over the expressive parameters, synthesis systems should be able to produce convincing results directly from a score. As countless interpretations of the same score are possible, the system should also target a particular singing style, which ...
A multi-layer F0 model for singing voice synthesis using a b-spline representation with intuitive controls
In the singing voice, the fundamental frequency (F0) carries not only melody, but also musical style, personal expressivity and other characteristics specific to the voice production mechanism. F0 modeling is therefore critical for natural-sounding and expressive synthesis. In addition, for artistic purposes, composers also need to have control over expressive parameters of the F0 curve, which is m...
Designing and Controlling a Source-filter Model for Naturalistic and Expressive singing voice synthesis
In this paper, we describe a voice synthesis model developed for musical purposes. Based on a source-filter model, this synthesizer has been specifically designed to allow the synthesis of natural-sounding singing voices by including pitch and amplitude variations and by careful tuning of consonant-to-vowel transitions. Particular attention is given to the reproduction of plosive consonants. ...
Expressive text-to-speech approaches
The core concern of this paper is the modelling and the tractability of expressiveness in natural voice synthesis. In the first part we quickly discuss the imponderable gap between natural and singing voice synthesis approaches. In the second part we outline a four level model and a corpus-based methodology in modelling expressive forms—an essential step towards expressive voice synthesis. We t...
HMM-based Mandarin Singing Voice Synthesis Using Tailored Synthesis Units and Question Sets
Fluency and continuity are essential in synthesizing a high-quality singing voice. In order to synthesize a smooth and continuous singing voice, the Hidden Markov Model-based synthesis approach is employed in this study to construct a Mandarin singing voice synthesis system. The system is designed to generate Mandarin songs with arbitrary lyrics and melody in a certain pitch range. I...